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Introduction 


The problem originated from neutrino physics [^. 

We consider a set of vectors in 3-dimensional Euclidean space 

We make a parametric assumption that this set is a sample of independent 
identically distributed variables, where a parameter of the distribution is a 
direction. 

As our estimator we take the direction of the arithmetic mean of the 
sample. This allows a simpler mathematical treatment compared to other 
possible estimators, since the sum of variables corresponds to the convolution 
of their pdfs, and this can be calculated in a standard way using the Fourier 
transform. 

Our goal is to find the distribution of the estimate in order to calculate 
confidence sets on the sphere, which we consider the precision of the estima¬ 
tor. We study both the exact case for finite samples and asymptotic cases, 
e.g. for number of events large. 

Our parametric models are the exponential distribution, section and 
the normal (Gaussian) distribution, section]^ 

These results are new compared to previous studies. The author was un¬ 
able to find directional results for the exponential distribution. In physical 
articles only the limiting case of large number of events is usually consid¬ 
ered Q. Mathematical literature on directional statistics usually deals with 
distributions on spheres [^, while in our case we have complete 3-dimensional 
information. 

This work was written for those who are not necessarily statisticians or 
mathematicians but have met the problems treated here. Therefore the au¬ 
thor attempted to use only the basic facts from mathematical undergraduate 
courses and introduced in detail more advanced notions when they were used. 
All the references used in this work can be found on the internet. 

Convolution of pdfs using the Fourier transform 

The probability density function (pdf) /(r) of the sum of two independent 
variables ri -|- r 2 in is given by the convolution of their pdfs: 

fr.+vM = (/i * / 2 )(r) = [ /i(r')/ 2 (r - r') dr' 

We denote the Fourier transform^oi a function /(r) as 

^this is similar to the characteristic function in probability theory, the latter is complex 
conjugate and without the factor {2tt)2 
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/(p) 


f e“*P’' 



and the inverse Fourier transform of / as / (therefore f{x) = f{x)). With 
this dehnition the inverse Fourier transform operator is complex conjugate 
to the direct Fourier transform operator. Then 


r p-iW r 

JiRd (zTT)"/ JjRd 

= {27,f‘j(p)g[p)^ ( 1 ) 

This is a well-known property of the Fourier transform, that it maps the 
convolution of two pdfs to the product of their Fourier transforms. 

Therefore the Fourier transform of the convolution of n distributions / is 

/n(p) = (/(p)r(27r)'=^ (2) 
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1 Exponential distribution 


1.1 Introduction 

Exponential distribntion appeared in the anther’s stndies connected to 
the problem in [^. The observed pdf at large x deviations was similar to 
~ e~T. To be spherically symmetric the pdf should be proportional to ~ e~T. 
To calculate the normalisation factor, we take the integral 


^oo /*oo /*oo 




e ' ‘ dx dy dz = Att I r^e ‘ dr = Sirl 


' —OO t/ —OO t/ —OO 


Hence the offset exponential probability density function in 3-dimensional 
space is 


1 \/ (^-^od+(y-yod+(^-^od 

fe{x,y,z\xo,yo,zo) = -^e ^ 


( 3 ) 


Fourier transform of /e and its convolutions 

In order to calculate the Fourier transform of /e, we calculate the integral 


-d^r 


e-*Proe-*Pr'e-T d^r' 


roo rn , 

2^g-ipro / / e-*P"'“'^®-Tsin0d0r'2dr', 

Jo Jo 


the inner integral on 6 


f ^-ipr' cos e d COS 0 = ) = — (e*P^' - ), (4) 

—ipr' ipr' 

and the outer integral on r' with one of the complex conjugate exponents 


dr' = 


r = 


i-tpj {j-ip)^Jo 

combining the two conjugate integrals, we obtain 


re dr = 


Cj-tpr 


ip 


ipY 


- = “71 




Aip 

~T 


ip{j - ip^U+ ipy ip{h+p‘^)‘^ Kw+p'^y' 
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therefore, taking into account the normalisation factor and the factor 

3 

( 27 r )“2 from the Fourier transform, 


/e(p) = 


Svr 


-*pro — 


-ipro 


( 5 ) 


8vr/3(2vr)f/(^+p2)2 (27r)i(l +/V)" 

From |2] and |5] we can learn the Fourier transform of the convolution of n 
exponential distributions: 


p-iV^on 

fnip) = -3-• 

(27r)i(l + /V)"’^ 

Thereby the convolution of n exponential distributions is 


( 6 ) 


/n(r) = fn{p) 



1 

(27r)3 


ip(r-ron) 

_d^P 

(1 + /2p2)2n ^ P’ 


choosing spherical coordinates with the z axis along r — nro, the exponent 
becomes e*pk-«ro|cos6»^ using |^, 


/n(r) = 


^oo gzpr|r—nrol _ ^—ipr\r—nrQ\ 


(27r)2 


7o ip|r —nrol {1 + Pp^Y'^ 

1 f°°psm{p\r — nro\) 


dp = 


27r2|r — nrol Jo {1 + Pp‘^Y'^ 

Substituting inside the integral P = f, 

1 


dp. 


/n(r) = 




27r2/2|r — nrol Jo (1 + 


2n 


dx. 


( 7 ) 


Distribution of the sample mean En 

A statistic useful in practical applications is = ^ , the arithmetic mean 
of r. We can calculate the probability density function of the random 

variable r„ and, using the conservation of probability under the change of 
variables Enirn) d^r„ = /„(r) d^r , we obtain Enijn) = n^/n(nr„) 


E ir ) = _ - _ 

27r2/2|r^_ro| 


xsinrx ^l’'V°l 


(1 + x^)^” 


dx. 


( 8 ) 


The integral can be calculated analytically using the formula 3.737(2) 
from [a > 0, Re/9 > 0]: 
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a;sin(aa;) dx 


nae ^ (2n — fc — 2)!(2a/3)^ 

T 2-^ 


Jo (x^ + 

Combining and , we obtain 


_ ^ 2‘^^n\k\{n — k — 1)\ 


TT 

2® 


—a0 


[n = 0, /? > 0] 


( 9 ) 


e ■'“I (4n - 4 - fc)!(2f |r„ - roD* 

-^r?, (I'r). ) 7T7 TTT” 7TZ ITT ^ ^ 


vr/^ 2^"’ ^(2n —1)! “ 


fc =0 


k\{2n-2-k)\ 


In the case of n = 1 the sum in is equal to 1 and we obtain 


1.2 Properties of 


( 10 ) 


In this subsection we study representations of other than 10 and its 
connection with hypergeometric functions. 

Calculation of the integral 

The integralfor a,/? > 0 can be easily reduced to that with /? = 1: 


xsm ax 


1 x=Py 1 
-dx = 


ysin{a(3y) 


dy 


n > 


Jo (x2+/32)n+l I 32 n (^2 + l)n+l 

0 The integral using integration by parts can be transformed to 


( 11 ) 


r°° xsin(ax) 

/o (x2 + 1)^+1 


dx = 


1 t ^ 1 

— sinlaxl — 

2n ^ E^ + x^y 


cos ax 


2n Jo (x2 + 1)' 


-dx = 


cos ax 


-dx. 


2n Jo (x2 + 1)' 

To compute this integral we apply complex analysis. We express 


( 12 ) 


f°° cos(ax) , 

/ -—dx 

Jo (x^ + 1)" 



^iax 

-^dx, 

(x2 + 1)^ 


(13) 


the integrated function is meromorphic on whole complex plane and has 
poles at ±h We close the integration contour in the upper half-plane near 
inhnity, where the integral is zero, and deform the contour into a small circle 
C of radius r around the pole +i- We parameterise z = i + then 
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giaz r2TT ^-a+iare'^l’ g-a r2TT giare''^g-i(n-l)(/)g|^0 

{z^ + ~ Jq {2ire^‘t> + r‘^e‘^^<t’y ~ 2^(ir)”-i Jq (1 + 

(14) 

In order to compute that integral, we decompose the integrand with re¬ 
spect to power series of and use the identity 



-271 


if n 7 ^ 0, 
if n = 0. 


(15) 


Since r can be made sufficiently small, we can decompose the denominator 
using the binomial formula; 


(i + i)-” = X^ 


A:=0 



—n 

k 



—n 

k 


(—n)(—n — 1)... (—n — k + 1) 
k\ 



(16) 


(17) 


"27r giare*'^g-i(n-1)0^0 

) (1 + 



27r(fr)"' ^^^fn + k — l^ (2a) 


2’"“^ ^ V ^-1 J(n-l-k)! 


271 (ir 


\n—l 


n—1 




fc =0 


k\ 


Finally, we obtain 


coslax) Tie 

-ax = 


/o + 1)' 


I 2n — 2 — k\ (2a) 
22 n-T n — 1 ) 


k=0 


k\ 


(19) 
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n = 0 As in the previous case, we transform 


xsinfax) 1 

—- ax = - Ini 

a;2 + 1 2 


"OO 


a;2 + 1 


dx 


( 20 ) 


As earlier, we close the contour of integration in the upper half-plane, 
deform the contour to the pole +i and parameterise the integration variable 
z = i + 


'•27r 


2ire^'t’ _l_ ^2^214) 


^2-k 


d0 = e-“ 


i + re*‘^)e“’ 
2 + 


pi./. 


-d0 


( 21 ) 


The integrand is analytic on [0, 27r] and thus can be expressed as a series 
of non-negative powers of from which only the zeroth term contributes 
to the integral and gives ni. Therefore 

xsinfax) , vr 
/ ——-dx = -e . 

Jo x^ + 1 2 

Proof that is properly normalised 

The integral ii^„(r„) d^r„ is equal to 1, since En is a properly nor¬ 


malised pdf. Here we calculate it explicitly using the formula 10 


After the parallel shift of r„ to ro, which doesn’t affect the total integral, 
after changing to spherical coordinates and having integrated on the polar 
angles, we obtain the equality to be proved 


n 




(4n - 4 - k)!(2fr„)'‘ 

-1- 


Q ^ k!(2n — 2 — k)! 


Using the integral x"'e ^ dx = r(?7, -|- 1) = n!, this transforms to 
1 


2n-2 


^ (4n-4 - fc)!2^(fc + 2)! _ ^ 


( 22 ) 


24.,-s(2„ _ 1)1 ^ k\(2n -2-k)\ 

This equality holds for n = 1. For n = 2 the left-hand side 

1 /4!2! 3! 2-3! 2!2T4!\ 1, 

2^3! V 2! 1! 2! ) 2 ^^ ’ 

In the remaining of this subsubsection we prove the identity Different 
techniques for calculation of closed forms of summations involving binomial 
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coefficients can be fonnd in [^. Here we use the method of hypergeometric 
functions. 

The Gaussian (or ordinary) hypergeometric function 2 F 1 (a, 6; c; z) is a 
special function represented by the hypergeometric series (|^, 7.2. 1(1)) 


2 Fi(a,b;c;z) 


= 1 + —^ + 
c 


a 


This can 


c(c + l) 2! 

(a + l)(a + 2)b(b + 1)(6 + 2) z^ 
c(c + 1) (c + 2) 3! 

be rewritten using the rising factorial or Pochhammer symbol 


(a)o — 1, 

(a)n = a{a + 1)... (a + n - 1), 


then 


2 Fi{a,b; c; ^ 


fc =0 


(c)t k\ 


(23) 


(24) 


In case when a or 6 is a negative integer, only a hnite number of terms is 
non zero. Using Pochhammer symbol we can express 


n\ 


{n — k)\ 


= n{n -1) ...{n-k + l) = {-lf{-n)k, 


[n — k)\ = 


n\ 


-lY{-n)k 


(fc + 2)! = l-2-3...(fc + 2) = 2-(3)fc, 
The sum in [22] can be rewritten as 


(25) 

(26) 


E 

A;=0 


(4n - 4 - k)\{k + 2)! 2^ (|^) (4n - 4)! (-2n + 2)^(3)^ 2^ 


(2n - 2 - k)\ k\ 


(-Z 


(2n-2)!^ (-4n + 4)fc k\ 


(Afi — 4V 

= ‘^ \2n-2)\ + 2, 3; -4n + 4; 2) (27) 

This hypergeometric function can be calculated using the formula 7.3.8(6) 
in j^: 


2 Fi{-n,a; -2n; 2) = 2 


2 n n\ fa + 1 


(2n) 


(28) 


Substituting into we obtain the original equality 
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1.3 CDF£;(cos6*(rj^)) 


The original problem in was to find the half of the opening angle of 
the cone whose origin is at (0,0,0) and the axis is the true direction, within 
which the given confidence level d (e.g. 68%) of events are contained. 

When we work in spherical coordinates {r,(j),6), where 6 is the angle to 
the true direction fq, we can compute the CDF as a function of 9. Then the 
confidence interval 9ci is CDF“^(c/). This can be calculated numerically if 
we know the CDF. 

In this subsection we calculate the cumulative distribution function of the 
polar angle 6 of the statistic r„. For more compact formulae, we calculate 
the CDF as a function of cos6'. 

Changing to spherical coordinates in and integrating on </> and r. 


n 


CDF£;(cOS0) = — 
r 


rl ^-fy/rl+r'^-2r„rocos0' 

2‘"-2(2n - 1)! 


2n—2 


(4n-4 - fc)!(2f7r2+r2 - 2r„rocos0')^ ^ 

y -MAtZ-Z-TTi-dcosb' (29) 


k=0 


k\{2n-2-k)\ 


Integral on cos^' 

For shorter notation we define 


a = rl + rl 
b = 2r„ro 
c = ? 


(30) 


We assume & % 0, that is we exclude the point r = 0 from the integration. 
We calculate 


e-^^^{cVa - bxfdx' = 


z = c\/ a — bx' 


dx' = 


2zdz 

bc^ 


2 

bd 


rc\/a—b 

/ dz 

J c^/a—bx 

(31) 

dx = -e-^k\ V ^ 

^ i\ 

(32) 


i=0 


Combining 
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^-^^/rl+r^o-^r„rocose' ^rl + - 2r„ro COS0'^ dcos0' 

/c+1 ^ 


(2 


k+l y\/r'2+rg-2r„ro 


ri^rnTo 

l\k + l)\ ^ 1 


i=0 


‘ \/^n+’'o“2r-„ro cos 6» 


TL^Vq 


i=0 


r„ 


j\rn-ro\] - 


_ ]Lf,-J^yrl+r^-2r„rocose y2 i ^2 _ 



+ Tq — 2r„ro cos 9 


Integral of 1^ over r 


(33) 

(34) 

(35) 


Integrating 33, we multiply it by from the Jacobean, and expand the 
modulus 


rn.e 


-j\rn-ro\ (^j\rn - ro|) dr^ = (36) 

(y(?^o - ?^n)) dr„ (37) 
dr„ (38) 


r„.e 


+ / rnC 
Jro 


-j{r„-ro) 


n 

jK- 


The integral from 0 to tq is easily calculated using 32 


J (ro - Tn) = X, 
rn = ro- ^x, 
dVn = —-dx 

I 


n 


-X -I ^ 

roe ^ ^ 


j=0 


[ - (ro--x] e-^x^dx^^^ 

'fro ^ V ri 


12 rjro 

- — / x^+^e-^dx = 

Jo 


jro 


^ -I ^ p 

=-ro*!-roe ^-^ ' 

n n ^^ ?! to 

j=o 


T^O 


-— / x*+^e-*dx (39) 
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We have retained the last integral, for it is useful in what follows. 
The integral from Tq to inhnity is taken similarly: 


I 

n 


e (y(r„ - ro)) dr,. 


(2 + l)! + -rod (40) 
n / n 


Adding [3^ and 40, we obtain 


7 7 * (^rnV 72 72 rfro 

= 2d—ro -roe“^^°d ^ —h (f + 1)!^-^ 

n n 'i' "" "" 


i=o 




(41) 

In case of a large number of events n or, more precisely, when jTo S> 1, 

71 ^ 

the exponent is much smaller than any power of jTq, and 


, I 

~ 2 d — tq. 
n 


(42) 


This corresponds to the case when most of the integral]^ is accumulated 
in a neighbourhood of Vn = tq. 


Integral of 34 


over Tr, 


In this subsubsection we calculate the integral 


r„e i \/^^+^0 2r„rocos6» ^ ^^2 ^2 _ 27 -,,rQ COS 6*^ dr„. 


(43) 


We introduce 


Tl / 

X = -jyrl + rl- 2r„ro cos 9, 

then we can rewrite 

^ = (r„ — ro cos6*)^ + rQ(l — cos^ 6*) 

Change from to x is a change of coordinates if r„ is uniquely dehned 
through X and vice versa. Therefore we should separately consider the regions 
> "'"o cos 9 and < tq cos 9. The point = tq cos 0 on a line of integration 
corresponds to the maximum of the pdf on that line (this is the nearest point 
on the line to the the mode of the distribution). 
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cos 6 * > 0 


fn > fo COS 6 


P 

'T'n = ro COS 6 + \l ~ ’" 0(1 ~ COS^ 


n 


drn = 


^xdx 


-xdx 


sj ~ ''" 0(1 ~ cos^ 9) ~ “ cos‘^0) 


' ro cos 9 


TnC "V’^n+r'o 2r„rocos6» + Tq — 2r„ro COS 6*^ dr^ 


I ^ /•“ X*+l 

= —ro cos ^ - 

n JiiroVi=^ - fr^{l - cos2 0) 

/2 /•“ 


e dx + 






yroVl—cos^ 9 


(44) 

(45) 


The integral converges at the lower limit for cos 6 ^ 7 ^ 1 because the 
integral ^Ji^_a 2 ~ la converges. The integral 


"OO (:r=achi;) 


ch'+V 


can be expressed through the modified Bessel function (8.407 in 
using the formula 3.547(4) from [^: 


exp (—/Scoshx) cosh ( 7 a;) dx = [(3) 


[Re /3 > 0] 


since ch"'a: can be expressed as a sum of ch(fca;) using 1.320(6) and 1.320(8) 
from j^. 


0 < r„ < ro cos 9 


fP 

\ ~ ''’ 0(1 ~ = ''"0 cos 9 — Tn, 

[P 

fn = COS d — sJ ~ '^ 0(1 “ 
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The limits are converted to cos'^d^ differential dr„ is 

the same as in the previous case > tq cos 6 except for the negative sign, 
which we omit changing the upper and the lower limits of the integration. 


r*rQ cos 6 


TnC ? \/'’n+^0 2r„ro cos6l ^^2 _j_ ^2 _ 2r„ro COS dr„ = 


r»'0 


yroVl—cos^ 6 


To COS 6 — \l -^x'^ — rQ(l — cos^ 6) 




I 




X 

= —To cos 9 I — ! - 

n - fr^il - cos2 9) 


r.i+1 


f /-T 


\J x"^ — ^^ 0(1 — cos^9) 

e~^ dx 

(46) 


ro 


x^+^e-^ dx. 




y roVl—cos^ Q 
Adding and to gives 


l>cos d>0 


I 


X 


i+l 


= —ro cos 9 / — 1 = _ 

^ yx^ — ^ro(l — cos2 6*) 


f 


e dx H—^ / x*+ e dx+ 

^ro 


n? /n. 


I 


X 

+ 2 —rocos 6 * / — ! - 

n JnroVT=M _ ^^ 2^1 _ cos2 9) 


r.i+1 


e ^ dx. 


(47) 

(48) 

(49) 


COS 9 < 0 


In this case r„ is always bigger than rg cos 9 and the integral 


becomes as in the case of 


The lower limit is changed from = 0 to x = yrg, 



n 


ro cos 9 




X 


i+l 


y^r§(l - cos‘^9) 


e ^dxH—r 


x*+^e-^ dx. 


rT'O 


(50) 


Note that the only difference between 50 and 47 is 49 


We can combine the results for cos 6* < 0 and for cos^ > 0 using the 
Heaviside step function: 
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0(a:) 


1 X > 0, 

0 X < 0. 


(51) 


7 r°° rfi+i 

= —To cos 9 / . 

^ yx^ — ^rQ(l — cos^ 0) 

/•t^o 


j 2 roo 

e~^ dx -\—r / x^^^e^'^dx 




T^O 


^i+l 


+ 20 (cos 6 *)—ro cos^ / — ^ ^ ^ 

JfroVi=^ - fr^{l - cos 2 0 ) 


e dx (52) 


CDF(cos0(r„)) 

Combining the calculations for the CDF(cos 6 '), 


CDF(cos») E® {4n-4-ky.2'‘{k+iy. 


I ro 


k=0 


k\{2n-2-k)\ 


nOQ fc+l ^ / 


e-?hn-ro| 


^y|r„ —rolj —e ?\/»’n+»’o 2 r-„rocose ^ ^2 _ 2 r„ro cos 0 


I ro 


2n—2 

E 

/c=0 


fc +1 


|41|52| 72^ 1 (4?1 — 4 — fc)!2 (fc + 1) 1 


(2n-2-fc)! 


i=0 


2z!-ro-roe * °z! ^ + (^ + 1)!^-^ / 

^ j! n2 n2 Jo 


Fo 


x*’''^e "^dx (53) 


n 


n 


^ 2+1 


■hocos^" , ^ 

^ V ~ ~ COs2 6*) 


72 ^OO 

6 “"^ dx- - x*’''^e“'^ dx (54) 


n2 I 


i+1 


I X 

- 20 (cos 6 *)—ro cos^ / — ^ ^ 

_ ^^2(1 _ cos2 0) 


e-^ dx 


The line [53] corresponds to 33 while M and below come from ^ 


The last terms in 




53 


53 


and 


54 


sum up to —^T{i + 2 ) and cancel with 


Thus we obtain the hnal answer; 
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2n—2 


CDF^(cos()(r„. r„)) = | 2(k + 2) 


A:=0 


(2n-2-fc)! 


fc+i 


e-t-o 


^(i + 2-i)(M-cos9 r 


Y^fc +1 
2 /i =0 i! 


i =0 


:e dx 


— 20 (cos 0) cos 9 


-rro 


i^’o y — 72 -?’o (1 ~ cos^ 6*) 

Y^fc +1 

A-/i =0 i! 


=r-ox/I=^ y^x2- fr2(l-COs2 0) 


e dx (55) 


The first term in 55 is equal to 2^” ^(2n — 1)! due to 22 


1.4 Approximation of En and 9{cl) for n large 


As in 


we use 


n , 


as = yTn - ’^ol 


for briefer notation. We also introduce El 


F„ = — 


r°° xsinfax) 


a Jq (1 + x'^y 
then the pdf can be expressed as 

1 


dx, 


Let also 


En — 


Ibvr^/s 


F2n- 


(56) 


(67) 


( 68 ) 


then due to 


/ cos(a;r) 


( 59 ) 


P _ ^ T 

” 2 (n-l)”-'' 


(60) 


Even though we have the exact formula]^ for J„, we need to hnd a simple 
estimate for that when n is large. 

The function (1 has its maximum at a; = 0 and rapidly decreases 


to zero when x and n increase. Thus the integral 59 can accumulate most of 


^the subscript E and the dependence F(a) are omitted throughout this section 
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Figure 1; for different a values. The upper curve represents 


its value near x = 0. For that to happen, cos(ax) should be positive when 
(1 + is large enough. 

Small a (large In{o)) correspond to the maximum of the pdf, while large a 
(small In{(^)) correspond to the tail of the distribution. If we are interested in 
conhdence levels neither too close to 1, nor too close to 0, we should explore 
the region where /n(a) takes intermediate values. 

On hgure[^ three different cases for the parameter a are shown. On l(a 


is small, and the integralis close to its maximum (as if a = 0, cos(ax) = 1). 
On l(b)|a is large, the cos(ax) oscillates fast and the integral]^ is small. To 


estimate the parameter of interest a*, we dehne it such that the hrst zero of 


cos(ax) is when (1 + x" 


we set it to £g. 1(c) 


is neither too large, nor too small. For an estimate 


Solving = I gives x^ = 2- - 1 and 


y/n 


( 61 ) 


T-y/n 


a* such that cos(a*x*) = 0 is equal to a* = ^ ~ 2 yT 7 ^' 

We can change 2 from the example to another number A , then a* changes 


to 


TT\/n 


which depends very weakly on A as A grows. Therefore a of interest for 
our problem is less than or of the order of 


n. 


The derivative w.r.t. a of the integrand of fo. 


—X sin(aa:) 


(l+a;2)" 


< 


The 


integral (i_|_^ 2 jn dx converges for n > 1, therefore, accorc 
known theorem M, we can differentiate 60 on the parameter: 


(l+a;2)" • 

ing to the well- 
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(4)1 = 


xsinlax) , \m 
da: = ■ 


'o (l + a:2) 


2 (n- 1 ) 


4—1- 


Now we derive a recursion formula for In- 


L. = 


dx = 4+1 + 


f°° (1 + x‘^) cos{ax) 

Iq (1 + x‘^Y+^ 

f°° x^ cosfaa;) , 1 1 

Iq (1 + x'^Y^^ 2n (1 + xY 


x^cos(ax) 


Y (1 + a;2)^+i 


dx, 


-X cos (ax) 
). '' ' 


+ 


1 r°° 

2n L 


cos(ax) — axsm(ax) , 1 ^ 

—dx = —4 - 


(1 + x^)^ 


2n 4n(n — 1) 


4 — 1 ) 


therefore 


(63) 


4+1 — I 1 ~ ~ I In + 


4 — 1 - 


2n) ' An{n — 1) 

In+I = In + o ('«“') for a < \/n , 


(64) 


(65) 


and substituting pq into 63, we can solve the differential equation up to 


the terms of the order of 


(1 + 0 ("-■)). 

4(a) = 4 ( 0 )e" 4 (i+c’(” 0) = 4 (o)e "4 (l + O (n“^)) . 
4 ( 0 ) can be found from the exact formula [T^ 

22-1 ((n- 1 )!) 2 - 

We can approximate 4(0) for large n using Stirling’s formula (6 


n! = y/YYn (— 
\e 


1+ O ( - 

n 


then 


( 2 n- 2 )! _ 42 ( 2 n- 2 ) 2 — 2 ^ 2^—2 

((n-l )!)2 427 r(r 44 J (ii - - 1 ’ 

therefore 


( 66 ) 


(67) 


( 68 ) 
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( 69 ) 


and 





(70) 

[60j70|n2-/7f _a£ . X 

^ e (1 + 0(n ^)) , 

(71) 


(72) 


Substituting og from [5^ we express En through the original parameters: 

-e 8 i 2 “ (1 + 0{n~^)), 


Enij^n) 


3 „ 

77-2 n\r„-rQ^^ 


(27r)58/3 


(73) 


and obtain in the leading order the normal distribution 77 with 

21 

0'E,n — 


n 


(74) 


which corresponds to the convolution of n normal distributions with 
Ce,! = 2/. In the limit of the number of events very large ^ 1 and 0 -C 


1), one can apply the formula 100 to estimate a confidence interval 0: 


2 /^-21n(l -c/) 


n 


(75) 
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2 Normal distribution 


2.1 Introduction. Distribution of the sample mean Gr 

A one-dimensional normal (or Gaussian) distribution has the pdf 


gG) = 




-_e 


( 76 ) 


TTcr^ 


With this dehnition the variance E[(a: — Xq)^] = cr^. For d dimensions the 
spherically symmetric multivariate normal distribution is 


^(r) = 


(27r(j2 


(r-rp) 

-e 2 <t^ 


(77) 


The Fourier transform of the multivariate normal distribution 


(r-rp) 


9{x) = 


(27r)2 (27rcT2)2 


-d'^r = 


2 2 
P cr 7 


(27r)2 jRd (27rcr^) 2 


2 h'^r = 

1 p 


(2vr; 


(78) 


This is similar to the normal distribution with the variance except 
that it is not properly normalised, since the Fourier transform preserves the 
L^-norm, but not the L^-norm. For the Gaussian distribution the direct 
Fourier transform coincides with the inverse Fourier transform. 

The Fourier transform of the convolution of n d-dimensional Gaussian 
distributions (|^ 


ffn(p) 


(2r) 


(n — l)d 
2 



2 

pncT 

2 


1 

-xc 

(27r)5 


(79) 


The pdf of the convolution of n Gaussian pdfs , which corresponds 
to the sum of n normally distributed variables, can be obtained by taking 
the inverse Fourier transform using (78) : 


9nG) 


1 _ r2 

-— g 2710-2 _ 

(27rncT2) 2 


(80) 


This is again the normal distribution with the variance re-scaled to ncr^. 
We use the average sum of Gaussian vectors r„ = ^, and we shift the center 
of the distribution to rn; then the standard deviation becomes 

6” y/n 
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Gn{rn) 


d 

n2 


(2 


_ n(rri-Tf)y 

re 2 <t 2 


TTCT^ 


( 81 ) 


2.2 CDFG(cos^(r„)) 

In this subsection we work with d = 3. For calculation of integrals in 
spherical coordinates in arbitrary dimension one may consult |^. 


27rn5 /■“ 2, _n(r2-2rTocose' + r2) 

COS 0) = - 3 - / r dr e ^ d cos 0 , 

( 27 r0-2)2 Jo Jcose 

the inner integral is taken easily in 3-dimensional space, 



/•I / 2 

/ 2nrro cos 0 (J 

/ e 2 ?^ dcos 6 * =- 

loose nr To 


2nrrQ 

e 2 ct 2 


2nrrQ cos 0 

e ^ 


therefore 


CDF(cos0) = 


n 


V 2 


na^ro Jo 




(82) 

To calculate the hrst term in brackets, it is sufficient to calculate the 
second term and put cos 6 = 1. We complete the square in the integral 


re 2 <t- 


■(r2-2rrocos0+r2)^^ ^ (l-cos^ 6») 


re 2 <t- 


Jr-rocosef^^^ (83) 


then the latter integral we split into two ones with r = (r — ro cos 6) + 
ro cos 6. The hrst part is a total derivative with respect to r — ro cos 0 = y, 




I _ ny I _ ny 

/ ye~^dy + roCosO / e~^dy = 

' —ro cos 6 J —ro cos 9 


nr^ cos^ (. 


= —e 2 ^:^ -|-rocos 6 * 


n 


_ ny 

e 2 ^d|/. 


' —ro cos 0 


The last term can be expressed through the error function ( [^, 8.250(1)); 

erf(a;) = f e~^^dt, (84) 

Jo 
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ny 

e 2^2 (^y = 


' —ro cos 0 


2 a 2 

n 

TTcr^ 

2n 


-^^cose 


e"* dt 


vrcr^ 


/ y/nrp 

2n 


erf 


cos 6 


(85) 


Combining and ^ into 82 


CDF(cos0) = 


n / a 


V2 


—e 2 ^^ + ro 


TTcr^ro 


n 




2 n 


Vy 2 


cr^ 


-^r2(i-co.20) / a 1^^ , r 

-e ^ “V y' I —e 2 ^^ + ro cos 0\l —— | 1 + erf 


n 


2n 


f y/nro 


\V2 


cos 6 




( 86 ) 


The first term cancels out, and we obtain the final result: 


CDFG(cos0(r„)) =^^l + erf 

- e-^ cos 0 (^1 + erf (^^^cos0^^ j (87) 

Note that CDFg depends only on one combination of parameters y/n^. 


2.3 Approximations of CDFg'(cos^) and 9{cl) 

In this subsection we consider the behaviour of CDFg (cos 0) for differ¬ 
ent values of cos 6 and parameters and the behaviour of confidence intervals 
(6* or cos 6*) for given confidence levels CDFg(cos6 *) = d. 

We introduce the parameter 


y/nro 

a = — 
y/2a 

and express the CDF as 


( 88 ) 


CDF(cos 0) = i (^1 + erf (a) - cos 0(1 + erf (a cos 0))) . (89) 
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In what follows we work with but keep in mind the expression of 
through the original parameters of the distribution n, tq, a. 

We are interested not only in limit cases, but even more in hnite statistics 
samples. We group the terms according to their orders and keep lower order 
terms explicitly. 


9 close to 0, n large 

The asymptotic representation for the error function for large argument 
is ( ( 3 ), 8.254) 


erf(z) = 1 - 






(90) 


( 2 -^) 

(where (—1)!! = 1). erf (a) tends very rapidly to 1 as a increases. There¬ 
fore the simplest approximation would be to substitute erf for 1 for large 
arguments. Thus for a S> 1, a cos 9 ^ 1 


CDF(cos«) = 1 - case + Cu 


(91) 


where 




(92) 


To express 9 from 91 is more difficult, since the exponent power (1 — cos^ 9) 
can be arbitrary. We fix CDF(cos6') = cl, move the term with 9 to the left 
side of[^ and take logarithm 


In cos 9 — (1 — cos^ 0) = In (1 — d -|- Oi), (93) 

- In (1 — sin^ 6*) — sin^ 0 = In (1 — d) -|- In H-(94) 

2 y 1 cl j 

The equation means that the results for lower d will be more precise 
than for d very close to 1, namely 1 — d should be much more than ai. 
Lower d also corresponds to smaller 9. 

In order to solve 94 w.r.t. sin6', we have to take a reasonable assumption 
sin^ 0 ^ 1; we introduce 


(5i = In (1 - sin^ 9) + sin^ 9 = 0(sin^ 9), 
we also rewrite the last term in [94] as 


(95) 
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(96) 


02 = In ( 1 + 


«! 


1 — cl 


= 0{ai) 


Then from 94, 95 96 


sin^ 6 = 


Tn(l — cl) 


+ 


/Si 


02 


1 + 2a^ i 


(97) 


The equation 94 can be solved with a better precision if we take into 
account more terms from the In series (see e.g. 0 1.511). Let 


132 = In (l 


sm 


9) + sin^ 9+- sin^ 9 = C>(sin® 9), 


then 94 transforms to a quadratic equation on sin"^ 9 


(98) 


^(32 - I sin^ ^ ~ Q ^ ^ ~ 

sin^ 6^ + 2 (l + 2a^) sin^ 9 + 41n(l — cl) + 4a2 — 2(32 = 0, 
sin^ 0 = — (l + 2a^) + ■>/(! + 20 ^)^ — 41n(l — cl) — 4q;2 + 2(32 


(99) 


The most precise formula for big a should be 99 For very big a and small 


9 we can get a simpler expression from 97 


y'-ln(l - cl) |88] y/-21n(l - cl) 


a 


i/nro 


( 100 ) 


9 close to I 

One can expect confidence intervals to be near | when a is neither too 
big nor too small. Therefore in this subsubsection we assume a ~ 1, so that 
a cos 9 1. 

The error function for small arguments can be approximated using inte¬ 
gration of the exponent series in 84 term by term; 


erf( 2 ;) = 


2 


A 



( 101 ) 
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CDF(cos6') = -(1 + erf(a)) — -— 008 0(1 H— —acosO + 71) 

2 2 -y/TT 

2 

(^1 ~ ga^cos^ej _|_ erf(acos0)), 

2 

where 71 = erf (a cos 9) - —a cos 9 = 0{a^ cos^ 9) 

y/Tl 


( 102 ) 

(103) 


To find cos0(c/) we denote 

— 1)(1 + erf(acos0)) = O(cos^0), (104) 

then 


TT 


COS 0 ( 1 H—cos 0 ) + cos 0(7i + 5i) = (1 + erf (a) — 2cl)e°‘ 


(105) 


In the leading order the solntion cos 0 of |105| is the r.h.s. of |105[ Therefore 
when we solve that equation up to O(cos^0), we chose the '+’ root: 


—1 + 4/l + -^ae“^(l + erf (a) — 2c/) —T=a cos 0(71 + (5i) 
cos 9 = - - -j-. (106) 


0 close to 71 


CDF(cos0) = ^(1 + erf(a)) — “^®™^^cos0(l + erf(acos0)) (107) 

The situation when 0 is close to vr can appear when we have a small and 
we are interested in large conhdence levels (our precision is low, but still 
allows us to exclude a region near the pole 0 = vr). In this subsubsection we 
don’t take assumptions on a, but use a Taylor series expansion of erf(z) at 
an arbitrary point: 

erf(a + A) = erf(a) + + (9(A^). (108) 

Therefore 
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erf(acos6*) = erf f—ay/l — sin^6*^ = —erf(a) H —^ae “%in^6* + e:i, (109) 

El = (9(sin"^ 6) 

To find 6{d) we solve the equation 


sin^ 6 


(— cos6')(l + erf(acos6')) = 2c/ — 1 — erf(a), 


sin^ 6 H—ln(l — sin^ 6) + ln(l + erf(a cos 6)) = ln(2c/ — 1 — erf(a)) 

^ ( 110 ) 


Using 109 


ln(l + erf (a cos 0)) = ln(l — erf (a)) + In ( 1 + 
= ln(l — erf (a)) + 


ae 


a/^(1 - erf (a)) 


-^ae “ sin 9 + Ei 
1 — erf (a) 

sin^6' + £2, (111) 


E2 = 0(sin"^ 9). 
Usingand 111, we obtain 


sin^ 9 


ae 


2 ^/n{l — erf(a)) 


= -y - ln(l - erf(a)) - £2 

+ ln(2c/ — 1 — erf (a)), 


sin^ 9 = + - 


ae 


i/T(l - erf(a)) 


In 


l-erf(a) \ A 
2 c/ — 1 — erf (a) / 2 ^ 


( 112 ) 

The r.h.s. of 112| is positive, since the argument of the last In is larger 
than 1; 1 — erf (a) > 2c/ — 1 — erf (a). However, the denominator of the 
logarithm’s argument should also be positive. 


cl > 


1 + erf (a) 


(113) 
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This means that if we want to exclude some percentage of the outcomes 
of Tn with the directions near the pole, we should chose a confidence level 
which satisfies 11131 

This is a necessary, but not a sufficient condition on cl. A more precise 
condition is that the r.h.s. of lll2l is less than 1. 

When a is small we obtain 21n ( 2 ^;^) < 1, and c/ > |(1 + 0.80. 
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Conclusions 


In this work a new approach to solving the problem in Q was proposed. 
The precision of the sample mean estimator was calculated analytically for 
the offset exponential and normal distributions both for a finite sample and 
for limiting cases. 

Even though the original applied problem concerned the exponential dis¬ 
tribution, the normal distribution was found to be also useful because of the 
central limit theorem [^. It was shown explicitly how the distribution of the 
sample mean of the exponential pdf converges near the mode to the normal 
distribution. 

While the normal distribution is tractable easier and has simpler formu¬ 
lae for the distribution of the sample mean and for the directional CDF, the 
exponential distribution has richer mathematical properties. While the dis¬ 
tribution of the convolution of normal pdfs depends only on one combination 
of parameters, for the exponential distribution this is not the case. While 
the normal distribution is stable, the exponential one is not. Geometric 
techniques were used to deal with the limiting case of the exponential dis¬ 
tribution. It was shown that the spherical projection of the sample mean of 
the exponential distribution has connections with hypergeometric functions 
and modified Bessel functions. 

In this study we didn’t concern other estimators, such as MLEs or the 
mean of the sample’s projection on the sphere. Note that in it was stated 
that the mean of unit vectors is a more precise estimator than the arithmetic 
sample mean. It might also be useful for mathematical applications to study 
the normal and exponential distributions in dimensions other than three. 
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